Author: Dylan Lawless, PhD
Affiliation: Department of Intensive Care and Neonatology, University Children’s Hospital Zurich, University of Zurich.
h2030gc fastq file names:
<SAMPLE_ID>_<NGS_ID>_<POOL_ID>_<S#>_<LANE>_<R1|R2>.fastq.gz
and Illumina fastq header:
@<instrument>:<run number>:<flowcell ID>:<lane>:<tile>:<x-pos>:<y-pos> <read>:<is filtered>:<control number>:<sample number>
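The two naming patterns above can be parsed programmatically. The following is a minimal sketch, not the pipeline's own code: the function names `parse_fastq_name` and `parse_fastq_header` are hypothetical, and the example NGS ID, pool ID, instrument and flowcell values used to exercise them are invented for illustration. It assumes that only the sample ID may contain underscores, so the file name is split from the right.

```python
import re
from typing import NamedTuple


class FastqName(NamedTuple):
    sample_id: str
    ngs_id: str
    pool_id: str
    sample_number: str
    lane: str
    read: str


def parse_fastq_name(filename: str) -> FastqName:
    """Split <SAMPLE_ID>_<NGS_ID>_<POOL_ID>_<S#>_<LANE>_<R1|R2>.fastq.gz."""
    suffix = ".fastq.gz"
    if not filename.endswith(suffix):
        raise ValueError(f"not a .fastq.gz file: {filename}")
    stem = filename[: -len(suffix)]
    # Split from the right: the sample ID may itself contain underscores
    # (assumption: the five trailing fields never do).
    parts = stem.rsplit("_", 5)
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields in: {filename}")
    if parts[5] not in ("R1", "R2"):
        raise ValueError(f"read field must be R1 or R2 in: {filename}")
    return FastqName(*parts)


# Field names follow the header layout given in the text. The last field is
# matched loosely because some runs carry an index sequence in that position.
HEADER_RE = re.compile(
    r"^@(?P<instrument>[^:\s]+):(?P<run>\d+):(?P<flowcell>[^:\s]+)"
    r":(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+)"
    r" (?P<read>\d+):(?P<filtered>[YN]):(?P<control>\d+):(?P<sample>\S+)$"
)


def parse_fastq_header(line: str) -> dict:
    """Return the header fields as a dict of strings."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised header: {line!r}")
    return match.groupdict()
```

Splitting from the right keeps a sample ID such as `XYZ_STUDY_D.XYZ003_DNA` intact, which a plain split on every underscore would break apart.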
The two datasets (Sample ID: XYZ_STUDY_D.XYZ003_DNA and reference CH S2) of paired-end short reads were generated with the same human DNA NGS library protocol for clinical diagnosis of phenotype X. Clinical-grade sequencing (ISO 15189 accredited) was used to generate whole genome sequence (WGS) data at the Swiss Multi-Omic Center (SMOC). The Illumina NovaSeq 6000 platform was used in combination with TruSeq DNA PCR-Free library preparation. Analysis was performed with reference to GRCh38. Sensitive clinical data were handled according to established SPHN/BioMedIT guidelines on the sciCORE platform as part of SwissPedHealth. Herein, we evaluate sample performance for use in clinical diagnosis.
To do: forward PDB from AutoDestructR.
The analysis of variant interpretation performed by ACMGuru is summarised from the following evidence sources:
Figure: AutoDestructR Single Case - This figure includes the use of gene structure and functional data including sources from UniProt, protein structure data from PDB and AlphaFold.
Figure: Protein Pathway Construction Whole Genome V1 Compared - This figure includes the use of protein pathway construction from STRING (GO, KEGG, Reactome, etc.).
Figure: Combined GO Plots - This includes biological protein pathway information from GO.
Figure: QQ Plot Data from Joint Cohort Analysis - Contains the QQ plot data from the joint cohort analysis of single variants.
Figure: Protein Pathway Network 22 - Contains the protein pathway identified as enriched in patients sharing the same biological mechanism as cause of disease.
To assess the quality of fastq data, FastQC was used. Full HTML reports for each file are linked below:
The FastQC results were also assessed with fastqcr. The full HTML report is linked here: - Report assessment of FastQC
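As a complement to the HTML reports, the per-module status that FastQC writes to `summary.txt` inside each output archive can be tabulated directly. The sketch below assumes the usual three-column, tab-separated layout (status, module name, file name); the function names and the example text are hypothetical and are not results from this report.

```python
def parse_fastqc_summary(text: str) -> dict[str, str]:
    """Map each FastQC module name to its status (PASS, WARN or FAIL)."""
    statuses = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        status, module, _filename = line.split("\t", 2)
        statuses[module] = status
    return statuses


def flagged_modules(statuses: dict[str, str]) -> list[str]:
    """Return the modules that did not pass, sorted by name."""
    return sorted(m for m, s in statuses.items() if s != "PASS")


# Illustrative input only; not taken from the samples in this report.
EXAMPLE_SUMMARY = (
    "PASS\tBasic Statistics\texample_R1.fastq.gz\n"
    "WARN\tPer base sequence content\texample_R1.fastq.gz\n"
    "FAIL\tAdapter Content\texample_R1.fastq.gz\n"
)

if __name__ == "__main__":
    print(flagged_modules(parse_fastqc_summary(EXAMPLE_SUMMARY)))
```

Listing only the WARN and FAIL modules gives a quick per-file overview before opening the full reports.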
Fastq files were trimmed using Trim Galore, which wraps Cutadapt.
Reads were aligned to GRCh37 using BWA MEM and converted to bam format with samtools.
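The trimming and alignment steps described above could be assembled as shell commands along the following lines. This is a sketch under stated assumptions, not the commands actually run for this report: the function name `build_commands`, the thread count and all file paths are hypothetical, and the trimmed file names are passed in explicitly rather than inferred from Trim Galore's output naming.

```python
import shlex


def build_commands(
    ref_fasta: str,
    r1: str,
    r2: str,
    trimmed_r1: str,
    trimmed_r2: str,
    out_dir: str,
    out_bam: str,
    threads: int = 4,
) -> list[str]:
    """Return shell command strings for paired-end trimming and alignment."""
    # Adapter and quality trimming of both mates together.
    trim = ["trim_galore", "--paired", "--output_dir", out_dir, r1, r2]
    # Align the trimmed pairs; BWA MEM writes SAM to standard output.
    bwa = ["bwa", "mem", "-t", str(threads), ref_fasta, trimmed_r1, trimmed_r2]
    # Convert the SAM stream read from standard input ("-") to BAM.
    to_bam = ["samtools", "view", "-b", "-o", out_bam, "-"]
    return [shlex.join(trim), shlex.join(bwa) + " | " + shlex.join(to_bam)]


if __name__ == "__main__":
    for command in build_commands(
        "reference.fa",
        "a_R1.fastq.gz",
        "a_R2.fastq.gz",
        "trimmed/a_R1_trimmed.fq.gz",
        "trimmed/a_R2_trimmed.fq.gz",
        "trimmed",
        "a.bam",
    ):
        print(command)
```

Building the commands as argument lists and quoting them with `shlex.join` avoids problems with unusual characters in file names; the strings can then be logged or passed to a shell.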
The alignment data was assessed using:
Qualimap full HTML report links: - Sample AH - Sample CH
| Metric | CH | AH |
|---|---|---|
| In total (QC-passed reads + QC-failed reads) | 2011262 | 1999498 |
| Secondary | 15710 | 612 |
| Supplementary | 0 | 0 |
| Duplicates | 0 | 0 |
| Mapped | 2006501 (99.76%) | 1997929 (99.92%) |
| Paired in sequencing | 1995552 | 1998886 |
| Read1 | 997776 | 999443 |
| Read2 | 997776 | 999443 |
| Properly paired | 1968314 (98.64%) | 1992738 (99.69%) |
| With itself and mate mapped | 1986886 | 1996840 |
| Singletons | 3905 (0.20%) | 477 (0.02%) |
| With mate mapped to a different chr | 14488 | 1612 |
| With mate mapped to a different chr (mapQ>=5) | 8748 | 1426 |
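The percentages reported in the table can be reproduced from the raw counts. In the sketch below, the counts are copied from the table, while the dictionary layout and the function names are hypothetical. It uses the total read count as the denominator for the mapped rate and the number of reads paired in sequencing as the denominator for the properly paired and singleton rates, which is the choice that matches the reported values.

```python
# Counts copied from the alignment summary table above.
FLAGSTAT = {
    "CH": {
        "total": 2011262, "secondary": 15710, "supplementary": 0,
        "mapped": 2006501, "paired": 1995552,
        "properly_paired": 1968314, "singletons": 3905,
    },
    "AH": {
        "total": 1999498, "secondary": 612, "supplementary": 0,
        "mapped": 1997929, "paired": 1998886,
        "properly_paired": 1992738, "singletons": 477,
    },
}


def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


def summarise(counts: dict) -> dict:
    """Derive the three rates shown alongside the counts."""
    return {
        "mapped_pct": percent(counts["mapped"], counts["total"]),
        "properly_paired_pct": percent(counts["properly_paired"], counts["paired"]),
        "singletons_pct": percent(counts["singletons"], counts["paired"]),
    }


if __name__ == "__main__":
    for sample, counts in FLAGSTAT.items():
        print(sample, summarise(counts))
```

For both samples the total also equals the reads paired in sequencing plus the secondary and supplementary alignments, which is a useful internal consistency check on the table.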
The source code for this document, including all code used in this report, is available from the GitHub repository.